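Multimodal information is frequently available in medical tasks. By combining information from multiple sources, clinicians can make more accurate judgments. In recent years, several imaging techniques have been used in clinical practice for retinal analysis: 2D fundus photographs, 3D optical coherence tomography (OCT), 3D OCT angiography, and others. Our paper investigates three deep-learning-based strategies for fusing multimodal information to solve retinal analysis tasks: early fusion, intermediate fusion, and hierarchical fusion. The commonly used early and intermediate fusion strategies are simple, but they do not fully exploit the complementary information between modalities. We developed a hierarchical fusion approach that focuses on combining features across multiple dimensions of the network and on exploring the correlation between modalities. These approaches were applied to glaucoma and diabetic retinopathy classification, using the public GAMMA dataset (fundus photographs and OCT) and a private dataset acquired with the PLEX Elite 9000 (Carl Zeiss Meditec Inc.), respectively. Our hierarchical fusion method performed best in both cases and paves the way for better clinical diagnosis.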
translated by 谷歌翻译
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive for participation (70%), while the prospect of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
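Two of the practices named in the survey, patch-based processing of oversized images and K-fold cross-validation, can be illustrated with a minimal sketch. The function names `patch_origins` and `k_fold_indices`, the square patch shape, and the contiguous (unshuffled) fold layout are assumptions made for illustration; the survey does not prescribe any particular implementation.

```python
from typing import Iterator, List, Tuple


def patch_origins(shape: Tuple[int, int], patch: int, stride: int) -> List[Tuple[int, int]]:
    """Top-left corners of square patches that tile a 2D image.

    The last patch along each axis is shifted back so that it stays
    inside the image while still covering the border.
    """
    def starts(size: int) -> List[int]:
        if size <= patch:
            return [0]
        pos = list(range(0, size - patch + 1, stride))
        if pos[-1] != size - patch:
            pos.append(size - patch)  # extra patch to cover the border
        return pos

    return [(r, c) for r in starts(shape[0]) for c in starts(shape[1])]


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists for k contiguous folds."""
    # Distribute the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


if __name__ == "__main__":
    print(patch_origins((5, 5), patch=4, stride=4))
    for train, val in k_fold_indices(10, 3):
        print(len(train), val)
```

In practice the sample indices would usually be shuffled before splitting; the contiguous layout is kept here only to make the folds easy to inspect.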
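Current computational models that capture word meaning rely mostly on textual corpora. Although these approaches have been successful over the past decades, their lack of grounding in the real world remains an ongoing problem. In this paper, we focus on the visual grounding of word embeddings and target two important questions. First, how can language benefit from vision in the process of visual grounding? Second, is there a link between visual grounding and abstract concepts? We investigate these questions by proposing a simple yet effective approach in which language benefits from vision specifically in the modeling of both concrete and abstract words. Our model aligns word embeddings with their corresponding visual representations without degrading the knowledge captured by textual distributional information. We apply our model to a behavioral experiment reported by Günther et al. (2020), which addresses the plausibility of visual mental representations for abstract words. Our evaluation results show that (1) human behavior can be predicted to a large degree; (2) our grounded embeddings model human behavior better than their textual counterparts; and (3) abstract concepts benefit from visual grounding implicitly, through their connections to concrete concepts, rather than through having corresponding visual representations of their own.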
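Deaf and hard of hearing (DHH) individuals regularly rely on captioning while watching live TV. Live TV captioning is evaluated by regulatory agencies using various caption evaluation metrics. However, these metrics are often not informed by the preferences of DHH users or by how meaningful the captions are. There is a need to construct caption evaluation metrics that take the relative importance of words in a transcript into account. We conducted a correlation analysis between two types of word embeddings and human-annotated word-importance scores in an existing corpus. We found that normalized contextualized word embeddings generated using BERT correlated better with manually annotated importance scores than word2vec-based word embeddings did. We make available a pairing of word embeddings and their human-annotated importance scores. We also provide proof-of-concept utility by training word importance models, achieving an F1 score of 0.57 in the 6-class word importance classification task.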
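Language grounding to vision is an active field of research that aims to enrich text-based representations of word meaning by leveraging perceptual knowledge from vision. Despite many attempts at grounding, it is still unclear how to inject visual knowledge into the word embeddings of a language in a way that maintains a proper balance of textual and visual knowledge. Some common concerns are the following. Is visual grounding beneficial for abstract words, or is its contribution limited to concrete words? What is the optimal way of bridging the gap between text and vision? How much do we gain by visually grounding textual embeddings? The present study addresses these questions by proposing a simple yet very effective grounding approach for pre-trained word embeddings. Our model aligns textual embeddings with vision while largely preserving the distributional statistics that characterize word use in text corpora. By applying the learned alignment, we are able to generate visually grounded embeddings for unseen words, including abstract words. A series of evaluations on word similarity benchmarks shows that visual grounding is beneficial not only for concrete words but also for abstract words. We also show that our visual grounding method offers advantages for contextualized embeddings, but only when these are trained on corpora of relatively modest size. Code and grounded embeddings for English are available at https://github.com/hazel1994/visaly_grounded_word_word_embeddings_2.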
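Sparsely activated transformers, such as Mixture of Experts (MoE), have attracted great interest because of their extreme scaling capability, which enables dramatic increases in model size without significant increases in computational cost. To achieve this, MoE models replace the feed-forward sub-layer in transformers with a Mixture-of-Experts sub-layer and use a gating network to route each token to its assigned experts. Because the common practice for training such models efficiently requires distributing experts and tokens across different machines, this routing strategy often incurs a huge cross-machine communication cost, since tokens and their assigned experts are likely to reside on different machines. In this paper, we propose Gating Dropout, which allows tokens to ignore the gating network and stay on their local machines, thereby reducing cross-machine communication. As with traditional dropout, we also show that Gating Dropout has a regularization effect during training, resulting in improved generalization performance. We validate the effectiveness of Gating Dropout on multilingual machine translation tasks. Our results demonstrate that Gating Dropout improves a state-of-the-art MoE model, with faster wall-clock time convergence and better BLEU scores across a variety of model sizes and datasets.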
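The routing idea described in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: it assumes top-1 routing, one expert per machine, and a single dropout decision per training step that applies to all tokens at once. The names `route_tokens`, `cross_machine_tokens`, `local_expert`, and `drop_prob` are hypothetical.

```python
import random
from typing import List, Sequence


def route_tokens(
    gate_scores: Sequence[Sequence[float]],
    local_expert: Sequence[int],
    drop_prob: float,
    rng: random.Random,
) -> List[int]:
    """Return the expert index chosen for each token.

    With probability `drop_prob` the gating network is ignored for this
    step and every token stays with the expert on its own machine.
    Otherwise each token goes to its highest-scoring expert (top-1).
    """
    if rng.random() < drop_prob:
        return list(local_expert)  # gating dropped: no cross-machine traffic
    return [max(range(len(s)), key=s.__getitem__) for s in gate_scores]


def cross_machine_tokens(assignment: Sequence[int], local_expert: Sequence[int]) -> int:
    """Count tokens whose assigned expert lives on a different machine."""
    return sum(a != l for a, l in zip(assignment, local_expert))


if __name__ == "__main__":
    scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]  # 3 tokens, 2 experts
    local = [0, 0, 1]                              # machine hosting each token
    rng = random.Random(0)
    for p in (0.0, 1.0):
        chosen = route_tokens(scores, local, p, rng)
        print(p, chosen, cross_machine_tokens(chosen, local))
```

Setting `drop_prob` to 0 recovers ordinary top-1 routing, while 1 keeps every token local; intermediate values trade routing quality against communication volume.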
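Reservoir computing (RC) has received recent interest because the reservoir weights do not need to be trained, which enables implementations with extremely low resource consumption and could have a significant impact on edge computing and in-situ learning, where resources are severely constrained. Ideally, a natural hardware reservoir should be passive, minimal, expressive, and feasible. To date, proposed hardware reservoirs have had difficulty meeting all of these criteria. We therefore propose a reservoir that meets all of them by leveraging the passive interactions of dipole-coupled, frustrated nanomagnets. The frustration significantly increases the number of stable reservoir states, enriching the reservoir dynamics, so these frustrated nanomagnets fulfill all of the criteria for a natural hardware reservoir. We likewise propose a complete frustrated nanomagnet reservoir computing (NMRC) system with low-power complementary metal-oxide-semiconductor (CMOS) circuitry to interface with the reservoir, and initial experimental results demonstrate the reservoir's feasibility. The reservoir is verified with micromagnetic simulations on three separate tasks. The proposed system is compared with a CMOS echo-state network (ESN), showing an overall reduction in resources by a factor of more than 10,000,000. This indicates that, because NMRC is naturally passive and minimal, it has the potential to be extremely resource efficient.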
Bipedal robots have received much attention because of the variety of motion maneuvers they can produce and their many applications in areas such as rehabilitation. One of these maneuvers is walking. In this study, we present a framework for the trajectory optimization of a 5-link planar biped robot using hybrid optimization. Walking is modeled with two phases: the single-stance (support) phase and the collision phase. The dynamic equations of the robot in each phase are derived with the Lagrange method. The heel strike on the ground is assumed to be fully plastic. The objective function is the integral of the squared torque along the trajectory, and constraints such as zero dynamics are satisfied without any approximation. Furthermore, we introduce a new constraint called impact invariance, which ensures the periodicity of the time-varying trajectories. The remaining constraints yield better and more human-like movement.
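The torque-squared objective mentioned above can be approximated numerically once a trajectory has been discretized. The sketch below assumes the cost has the form J = ∫ Σ_i u_i(t)² dt over all joint torques u_i and evaluates it with the trapezoidal rule on sampled data. The function name `torque_squared_cost` and the sampling layout are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def torque_squared_cost(
    times: Sequence[float],
    torques: Sequence[Sequence[float]],
) -> float:
    """Trapezoidal approximation of the integral of summed squared torques.

    `times[k]` is the k-th sample time and `torques[k]` holds the joint
    torques applied at that time (one value per actuated joint).
    """
    if len(times) != len(torques):
        raise ValueError("times and torques must have the same length")
    cost = 0.0
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        f0 = sum(u * u for u in torques[k])      # integrand at the left sample
        f1 = sum(u * u for u in torques[k + 1])  # integrand at the right sample
        cost += 0.5 * dt * (f0 + f1)
    return cost


if __name__ == "__main__":
    # Constant unit torque on one joint for 2 seconds.
    print(torque_squared_cost([0.0, 1.0, 2.0], [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]))
```

An optimizer would minimize this value over the trajectory parameters while enforcing the dynamic and periodicity constraints separately.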
Humanoid robots are increasingly important, and one of their most valuable features is the ability to maneuver in environments, such as stairs, that other robots cannot easily traverse. A suitable algorithm for generating the path along which a bipedal robot climbs is therefore essential. In this paper, we present an optimization-based method for generating an optimal stair-climbing path for under-actuated bipedal robots without an ankle actuator. The generated paths are based on the zero and non-zero dynamics of the problem. Because the zero dynamics constraint is satisfied, the path can be tracked; in other words, the solution is dynamically feasible. The optimization uses a gradient-based method that requires a computationally reasonable number of function evaluations. The same method can also be used for descending stairs.
Finding and localizing conceptual changes between two images of the same scene taken at different times, in terms of objects that have been added or removed, is of great significance in special care applications. This is mainly because the addition or removal of important objects can be harmful in some environments. As a result, there is a need for a program that locates these differences using machine vision. The main challenges of this problem are changes in lighting conditions and the presence of shadows in the scene, so any proposed method must be robust to them. In this article, we introduce a method based on deep convolutional neural networks and transfer learning, which is trained with an intelligent data synthesis process. The method is tested on a dataset provided for this purpose, and the results are presented. The presented method is shown to be more efficient than other methods and can be used in a variety of real industrial environments.